Journal of Magnetic Resonance Imaging
Wiley
Preprints posted in the last 30 days, ranked by how well they match Journal of Magnetic Resonance Imaging's content profile, based on 10 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit.
McCullum, L.; Ding, Y.; Fuller, C. D.; Taylor, B. A.
Background and Purpose: Magnetic resonance imaging (MRI) is now used for radiation therapy treatment planning in many anatomical sites to better visualize soft tissue landmarks, a technique known as MRI simulation. A core component of modern MRI simulation configurations is the use of external laser positioning systems (ELPS) to help set up the patient. Though necessary for accurate and reproducible patient setup, the ELPS, if left on during imaging, may degrade image quality by leaking electronic noise, to which MRI is sensitive. It is currently unknown whether this leaked electronic noise also affects quantitative values derived from clinically employed relaxometric, diffusion, and fat fraction sequences. Therefore, in this study, we aim to characterize the impact of MRI simulation lasers on general image quality and quantitative imaging accuracy. Materials and Methods: First, a cine acquisition was used to visualize the real-time changes in image signal-to-noise ratio (SNR) as the ELPS was switched from deactivated to activated. To validate this effect quantitatively, the SNR was measured using the American College of Radiology (ACR) recommended protocol in a homogeneous phantom with the integrated body, 18-channel UltraFlex small, 18-channel UltraFlex large, 32-channel spine, and 16-channel shoulder coils. Next, a geometric distortion algorithm was tested in two vendor-provided phantoms using the integrated body coil, and the ACR Large Phantom protocol was run. Finally, a series of quantitative MRI scans was performed using a CaliberMRI Model 137 Mini Hybrid phantom to validate quantitative T1, T2, and ADC, while a Calimetrix PDFF-R2* phantom was used for quantitative PDFF and R2*. All scans were performed with the ELPS both deactivated and activated.
Results: With the integrated body coil, visible electronic noise artifacts appeared on the cine acquisition when the ELPS was activated, and SNR measured with the ACR protocol decreased four-fold. This SNR drop was not seen with the other tested coils. ELPS activation degraded the automatic fiducial detection algorithm, causing misidentification of fiducials that were identified perfectly with the ELPS deactivated. Degradation in image intensity uniformity, percent signal ghosting, and low contrast object detectability was seen during ACR Large Phantom testing using the 20-channel Head/Neck coil. Concordance of quantitative MRI values was similar with the ELPS deactivated and activated, although a consistent increase in standard deviation inside the ADC vials was seen when the ELPS was activated. Discussion: Activating the ELPS during imaging should be avoided because it can unnecessarily increase image noise. This is particularly true when conducting mandatory quality assurance testing for image quality and geometric distortion, which uses the integrated body coil, the coil most susceptible to ELPS-induced noise. Clear clinical guidelines should be implemented to make this issue known to the MRI technologists, physicists, and other relevant staff using an MRI with a supplementary ELPS for patient alignment.
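To make the SNR comparison above concrete, here is a minimal sketch of a two-region SNR estimate: mean signal in a phantom ROI divided by the standard deviation of a background (air) ROI. This is a common textbook approach and is not necessarily the ACR procedure the authors followed. The function name, the ROI values, and the optional 0.655 Rayleigh correction for single-channel magnitude images are assumptions for illustration.

```python
import statistics


def snr_from_rois(signal_roi, noise_roi, rayleigh_correction=True):
    """Estimate SNR as mean phantom signal over background-noise SD.

    Assumption: noise_roi is sampled from signal-free background of a
    single-channel magnitude image, where the 0.655 factor compensates
    for the Rayleigh distribution of the background.
    """
    mean_signal = statistics.fmean(signal_roi)
    noise_sd = statistics.stdev(noise_roi)
    if noise_sd == 0:
        raise ValueError("noise ROI has zero variance")
    snr = mean_signal / noise_sd
    return 0.655 * snr if rayleigh_correction else snr


# Hypothetical pixel values: same phantom signal, background noise
# four times larger with the lasers on.
signal = [1000.0, 1010.0, 990.0, 1000.0]
noise_elps_off = [0.0, 2.0, 0.0, 2.0]
noise_elps_on = [0.0, 8.0, 0.0, 8.0]

snr_off = snr_from_rois(signal, noise_elps_off)
snr_on = snr_from_rois(signal, noise_elps_on)
print(f"SNR ratio (off/on): {snr_off / snr_on:.2f}")
```

With these made-up values, quadrupling the background standard deviation divides the SNR by four, which is the size of the effect reported for the integrated body coil.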
Miyata, M.; Tomiyasu, M.; Sahara, Y.; Tsuchiya, H.; Maeda, T.; Tomoyori, N.; Kawashima, M.; Kishimoto, R.; Mizota, A.; Kudo, K.; Obata, T.
Purpose: Aqueous humor is known to drain from the eye not only via the conventional pathway through the trabecular meshwork and Schlemm's canal, but also via pathways through the posterior chamber and optic nerve to the cerebrospinal fluid (CSF) surrounding the optic nerve. The mechanism is poorly understood, and a non-invasive method for evaluation in living humans has not been established. We previously showed that eye drops containing O-17-labeled water (H₂¹⁷O) distribute in the anterior chamber but not the vitreous. This study aimed to evaluate the distribution of H₂¹⁷O in the CSF along the optic nerve. Methods: Five ophthalmologically normal participants (20-31 years, all female) were selected from a previous prospective study based on 1H MR images of the eyes that included the optic nerve. They received eye drops of 10 mol% H₂¹⁷O in their right eye. A dynamic image time series was created by normalizing the signal of each 1H-T2WI to the pre-drop average signal. Region-of-interest analyses were performed for signal changes in the anterior chamber, vitreous, and CSF. Results: In the quantitative evaluation, the normalized intensity in the anterior chamber and CSF was significantly lower than the pre-drop signal (anterior chamber: 0.78 ± 0.07, p < 0.005; CSF: 0.89 ± 0.07, p < 0.05). No distribution was identified in the vitreous. Qualitatively, the distribution of H₂¹⁷O in the anterior chamber was detected in all five participants and in the CSF of four participants (80%). Conclusion: H₂¹⁷O eye drops were distributed in the anterior chamber and CSF, but not in the vitreous. These findings suggest that aqueous humor outflow via routes other than Schlemm's canal can be visualized and may contribute to ocular fluid homeostasis, including the ocular glymphatic system.
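The baseline normalization described in the Methods can be sketched as follows. This is an illustrative assumption about the arithmetic, not the authors' pipeline: each ROI signal is divided by the mean of the pre-drop time points, so a value below 1 indicates signal loss relative to baseline. The function name and the sample values are hypothetical.

```python
import statistics


def normalize_to_baseline(signal, n_pre):
    """Divide every time point by the mean of the first n_pre (pre-drop) points."""
    if not 0 < n_pre <= len(signal):
        raise ValueError("n_pre must be between 1 and len(signal)")
    baseline = statistics.fmean(signal[:n_pre])
    return [s / baseline for s in signal]


# Hypothetical ROI signal: two pre-drop frames, then two post-drop frames.
roi_signal = [100.0, 100.0, 80.0, 78.0]
print(normalize_to_baseline(roi_signal, n_pre=2))
```

Normalizing each ROI to its own baseline removes differences in absolute signal between participants and regions, so the anterior chamber, vitreous, and CSF curves can be compared on one scale.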
Readford, T. R.; Martinez, G. J.; Patel, S.; Kench, P. L.; Andia, M. E.; Ugander, M.; Giannotti, N.
Background: Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) enables non-invasive characterization of carotid atherosclerotic plaque. Purpose: To evaluate the performance and reproducibility of a simplified DCE-MRI quantification method for carotid plaque assessment. Methods: T1-weighted black-blood DCE-MRI of the carotid arteries at 3T was performed at baseline and after six months in patients with mild-to-moderate atherosclerotic lesions in a pilot placebo-controlled randomized trial evaluating the effects of low-dose (0.5 mg daily) colchicine therapy on carotid plaque volume. DCE-MRI signal intensity was measured in manually drawn regions of interest in the plaque core, remote non-atherosclerotic vessel wall, and skeletal muscle. Peak signal intensities were normalized to skeletal muscle signal in the same slice. Results: In patients (n=28, median [interquartile range] age 72 [64-74] years, 36% female, n=13/15 colchicine/placebo), normalized peak signal intensity was higher in the plaque core than in the remote vessel wall at both baseline (3.5 [2.3-4.1] vs 2.1 [1.7-2.5], p<0.001) and follow-up (3.2 [2.5-4.4] vs 2.0 [1.7-2.5], p<0.001). Measurements did not differ between baseline and follow-up for all patients (0.7 ± 0.7 for plaque core, 0.6 ± 0.4 for remote vessel wall, p>0.80 for both) nor between colchicine intervention and placebo control (p>0.35 for either region). Conclusions: Normalised peak signal intensity on DCE-MRI was consistently higher in the carotid plaque core than in the remote vessel wall, showed excellent reproducibility in both regions over six months, and was not altered by colchicine treatment. This simplified, muscle-normalised approach may facilitate future studies exploring DCE-MRI measures potentially related to plaque vulnerability.
Haueise, T.; Machann, J.
Chemical shift-encoded magnetic resonance imaging using high-resolution 3D Dixon techniques enables the non-invasive and radiation-free assessment of whole-body adipose tissue and ectopic fat distribution. Automatic deep learning-based segmentation of metabolically relevant adipose tissue compartments and ectopic fat deposits in parenchymal tissue is the most important image processing step for quantifying adipose tissue volumes and ectopic fat percentages from whole-body imaging. This work presents a model dedicated to the segmentation of 19 metabolically relevant adipose tissue compartments and ectopic fat deposits from whole-body Dixon MRI. The trained segmentation model is available upon request. Related post-processing routines to compute volumes and fat percentages are publicly available: https://github.com/tobihaui/WholeBodyATQuantification.
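As a rough illustration of the kind of post-processing mentioned above, the sketch below computes a compartment volume from a label map and a mean fat percentage from Dixon fat and water signals, using the standard fat fraction F / (F + W). It does not reproduce the routines in the linked repository. The function names, the flat-list image representation, and the sample values are assumptions.

```python
def compartment_volume_ml(labels, label_id, voxel_volume_mm3):
    """Volume of one labelled compartment: voxel count times voxel volume."""
    n_voxels = sum(1 for v in labels if v == label_id)
    return n_voxels * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3


def fat_percentage(fat, water, labels, label_id):
    """Mean voxel-wise fat fraction F / (F + W) inside one label, in percent."""
    fractions = [
        f / (f + w)
        for f, w, lab in zip(fat, water, labels)
        if lab == label_id and (f + w) > 0
    ]
    if not fractions:
        raise ValueError("no usable voxels for this label")
    return 100.0 * sum(fractions) / len(fractions)


# Hypothetical flattened images: label 1 marks an organ of interest.
labels = [0, 1, 1, 2, 1]
fat = [0.0, 10.0, 30.0, 5.0, 20.0]
water = [100.0, 90.0, 70.0, 95.0, 80.0]

print(compartment_volume_ml(labels, 1, voxel_volume_mm3=2.0))
print(fat_percentage(fat, water, labels, 1))
```

Averaging voxel-wise fractions and taking the ratio of summed signals generally give different numbers; which one a given pipeline uses is a design choice that should be checked against its documentation.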
Hoe, Z. Y.; Ding, R.-S.; Chou, C.-P.; Hu, C.; Lee, C.-H.; Tzeng, Y.-D.; Pan, C.-T.; Lee, M.-C.; Lee, E. K.-L.
Background: Breast cancer-related lymphedema (BCRL) is a common complication following breast cancer treatment. While lymphoscintigraphy is considered the diagnostic gold standard, it is unsuitable for routine periodic monitoring or assessment of treatment efficacy. Shear wave elastography (SWE) offers a possible alternative, but traditional modes of operation limit its potential. Proposed Solutions: The Holder-Optimized Elastography (HOE) method is introduced to eliminate pressure issues introduced by manual operation of ultrasound probes by stabilizing them above the cutis. Methods: The HOE method was used to acquire ARFI images of high-velocity areas (HVAs, with shear wave velocity greater than 7 m/s) in limbs with and without BCRL (as confirmed and characterized by lymphoscintigraphy) in two cohorts of 15 and 125 patients. Results: The HOE method enabled ARFI elastography to directly and consistently visualize the effects caused by both obstructed lymphatic vessels and intraluminal lymphatic fluid as HVAs, whereas traditional hand-held methods did not. Inter-limb differences in HVA burden showed moderate diagnostic performance for detecting BCRL and grading obstruction with modest sensitivity. However, there was systematic underestimation of both early and confluent advanced lesions. Conclusion: HOE-based HVA imaging has potential for rapid and non-invasive monitoring of lymphedema course and treatment response and may serve as a useful adjunct to existing diagnostic tools for BCRL. However, further technical refinements and quantitative analytic methods will be required to fully exploit the richer SWV information provided by HOE and to enhance the diagnostic utility of HVAs. Summary Statement: The Holder-Optimized Elastography method ("HOE" method) increases the diagnostic capability of ARFI elastography for breast cancer-related lymphedema, allowing for the non-invasive detection of some lymphatic obstructions but not all.
Key Results: The Holder-Optimized Elastography (HOE) method revealed the effects caused by fluid-filled lymphatic vessels as "High-Velocity Areas" (HVAs), which are difficult to detect by conventional methods. HVA counts for detecting lymphedema (any obstruction vs. no obstruction) showed high specificity (0.86-1.00) but low sensitivity (0.57-0.67). Conversely, HVA counts for staging lymphedema (i.e., total vs. partial obstruction) showed high sensitivity (up to 1.00) but low specificity (0.48-0.66). The inter-limb difference of HVAs counted in whole-limb scans between affected and unaffected limbs (i.e., the "Global Mean Difference") provided the most balanced diagnostic performance (sensitivity 0.67-0.79, specificity 0.88-0.89).
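The sensitivity and specificity figures above come from thresholding a count-based measure against the lymphoscintigraphy reference. A minimal sketch of that calculation, assuming a limb is called positive when its inter-limb HVA difference reaches a cutoff, is shown below. The cutoff, the data, and the function name are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(differences, has_bcrl, cutoff):
    """Classify as positive when difference >= cutoff, then score against the reference."""
    tp = fn = tn = fp = 0
    for diff, diseased in zip(differences, has_bcrl):
        positive = diff >= cutoff
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical inter-limb HVA differences and reference diagnoses.
differences = [5, 3, 0, 1, 0, 2, 4, 0]
has_bcrl = [True, True, True, False, False, False, True, False]

print(sensitivity_specificity(differences, has_bcrl, cutoff=2))
print(sensitivity_specificity(differences, has_bcrl, cutoff=3))
```

In this toy data, raising the cutoff from 2 to 3 removes the one false positive without losing any true positives; in general, a stricter cutoff trades sensitivity for specificity, consistent with the high-specificity, low-sensitivity pattern reported for detection.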
Kästingschäfer, K. F.; Fink, A.; Rau, S.; Reisert, M.; Kellner, E.; Nolde, J. M.; Kottgen, A.; Sekula, P.; Bamberg, F.; Russe, M. F.
Rationale and Objectives: Contrast-enhanced (CE) MRI provides clear corticomedullary contrast for renal compartment delineation but may be contraindicated or undesirable in routine practice. We aimed to enable automated extraction of renal imaging biomarkers from routine non-contrast-enhanced (NCE) T1-weighted MRI by transferring CE-derived compartment labels. Materials and Methods: This retrospective single-center study (January 2017 to December 2021) included 200 participants with paired arterial-phase CE and NCE T1-weighted MRI. Cortex, medulla, and sinus were manually segmented on CE MRI and rigidly transferred to NCE MRI to provide voxel-level reference labels. A hierarchical 3D Deep Neural Patchworks model was trained on 100 examinations (90 training/10 validation) and evaluated on an independent test set of 100 examinations using the transferred CE masks on NCE as reference. Performance was assessed using Dice similarity of segmentations and biomarker agreement using volumes and surface areas (Pearson/Spearman, MAE, Lin's CCC, and Bland-Altman). Results: Whole-kidney segmentation Dice was 0.950 (left) and 0.953 (right). Total kidney volume showed high agreement with minimal bias (MAE 8.76 mL, 2.5% of mean; CCC 0.983; bias -1.56 mL; 95% limits of agreement -28.81 to 25.69 mL). Cortex volume was modestly overestimated and medulla volume underestimated, shifting predicted compartment fractions toward cortex (74.7% vs. 72.1% in ground truth; medulla 21.5% vs. 24.3%; sinus 3.8% vs. 3.6%). Sinus volume maintained high concordance despite higher Dice dispersion. Surface area was systematically underestimated with low concordance. Conclusion: CE-supervised knowledge transfer enables accurate, well-calibrated kidney volumetry from routine NCE MRI and supports contrast-free renal biomarker extraction. Surface area estimation remains challenging.
Take-home Messages:
- CE-supervised label transfer enables accurate, well-calibrated contrast-free kidney volumetry on routine non-contrast T1-weighted MRI.
- Compartment volumetry is feasible but shows systematic cortex overestimation and medulla underestimation; surface area remains non-interchangeable due to boundary uncertainty.
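Two of the agreement statistics named in this abstract, Lin's concordance correlation coefficient and Bland-Altman limits of agreement, can be sketched in a few lines. This is a generic illustration using the textbook definitions (population moments for the CCC, bias ± 1.96 × SD of the paired differences for the limits), not the authors' analysis code. The function names and the sample volumes are assumptions.

```python
import statistics


def lins_ccc(x, y):
    """Lin's CCC: 2*cov / (var_x + var_y + (mean_x - mean_y)^2), population moments."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


def bland_altman(pred, ref):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [p - r for p, r in zip(pred, ref)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical predicted vs. reference volumes in mL.
reference = [300.0, 320.0, 350.0, 400.0]
predicted = [298.0, 325.0, 345.0, 399.0]

print(round(lins_ccc(predicted, reference), 3))
print(bland_altman(predicted, reference))
```

Unlike Pearson correlation, the CCC penalizes a constant offset between methods through the squared mean difference in the denominator. The limits reported in the abstract (-28.81 to 25.69 mL) are symmetric about the stated bias of -1.56 mL, as this formula implies.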
Seo, W.; Jabur Agerberg, S.; Rashid, A.; Holmstrand, N.; Nyholm, D.; Virhammar, J.; Fallmar, D.
Introduction: Idiopathic normal pressure hydrocephalus (iNPH) is a partially reversible neurological disorder in which imaging biomarkers support diagnosis and surgical decision-making. The callosal angle (CA) is one of the most robust radiological markers of iNPH and has also been associated with postoperative shunt outcome. However, several manual measurement variants exist and artificial intelligence (AI)-based tools now enable automatic CA measurement. Materials and Methods: In total, 71 patients (40 with confirmed iNPH and 31 controls) were included. Six predefined manual methods for measuring CA were applied to preoperative 3D T1-weighted MRI and evaluated for diagnostic performance and interobserver agreement. An AI-derived automatic CA (cMRI from Combinostics) was included as a seventh method and compared with the traditional manual method (perpendicular to the bicommissural plane and through the posterior commissure). Automatic measurements were additionally assessed in pre- and postoperative scans to evaluate robustness against shunt-related artifacts. Results: All seven CA variants significantly differentiated iNPH patients from controls (p < 0.05). The traditional method showed the highest discriminative performance (AUC = 0.986, SE = 0.012), while alternative planes demonstrated slightly lower accuracy (AUC range = 0.957-0.978). Interobserver agreement for manual measurements was good to excellent (ICC = 0.687-0.977). Automatic CA measurements showed excellent correlation with the traditional method (preoperative ICC = 0.92; postoperative ICC = 0.96). Conclusion: Although several CA positions perform comparably, the traditional method remains marginally superior and is best supported by the literature. Automated CA measurements closely match expert manual assessment in pre- and postoperative imaging, supporting clinical implementation.
Hartmann, K.; Beeche, C.; Judy, R.; DePietro, D. M.; Witschey, W. R.; Duda, J.; Gee, J.; Gade, T.; Penn Medicine Biobank, ; Levin, M.; Damrauer, S. M.
Purpose: Portal hypertension, a major complication of chronic liver disease, leads to significant morbidity and mortality. While portal vein diameter measured on imaging has long been proposed as a non-invasive marker of portal hypertension, normative CT-based reference values and population-level associations remain incompletely characterized. Here, we aim to define contemporary reference values for portal vein diameter on clinically obtained CT and evaluate its associations with demographic, clinical, and imaging factors, as well as its diagnostic performance for portal hypertension. Methods: We conducted a retrospective analysis of 20,225 clinically obtained CT scans at a single academic medical center. The main portal vein was automatically segmented using TotalSegmentator, and maximum diameter extracted using the Vascular Modeling Toolkit. Associations with demographic and imaging factors were evaluated using linear mixed-effects models; prevalent liver disease and portal hypertension using logistic regression; risk of incident ascites and esophageal varices among participants with liver disease using Cox regression; and invasive hepatic venous pressures using correlation analysis and linear regression. Results: The mean portal vein diameter was 12.4 mm (95% CI, 12.37-12.45). Larger diameter was independently associated with male sex (+1.4 mm), higher BMI (+0.11 mm/kg/m²), greater height (+0.04 mm/cm), and older age (+0.05 mm/10 years) (all p < 0.001), and was substantially larger on contrast-enhanced abdomen/pelvis CT (+2.4 mm, p < 0.001). Each 1-mm increase in portal vein diameter was associated with higher odds of prevalent liver disease (OR 1.06; 95% CI, 1.04-1.08) and portal hypertension (OR 1.18; 95% CI, 1.12-1.28). Among individuals with liver disease, greater diameter predicted higher risk of incident esophageal varices (baseline diameter HR 1.50; 95% CI, 1.14-2.08) and ascites (HR per mm increase in diameter 1.06; 95% CI, 1.003-1.12).
However, portal vein diameter demonstrated weak to no association with invasively measured hepatic venous pressures. Conclusion: In this large, EHR-linked imaging cohort, the mean portal vein diameter on CT was 12.4 mm and varied with demographic and imaging factors. Larger diameter was associated with liver disease, portal hypertension, and subsequent development of varices and ascites, supporting use of portal vein diameter as a pragmatic screening or enrichment tool within multimodal clinical frameworks.
Key Results:
- Mean portal vein diameter on routine clinical CT was 12.4 mm (95% CI, 12.37-12.45) and varied with sex, height, BMI, exam type, contrast use, and clinical setting.
- Each 1-mm increase in portal vein diameter was associated with higher odds of prevalent liver disease (OR 1.06) and portal hypertension (OR 1.18).
- Among individuals with liver disease, larger portal vein diameter predicted higher risk of incident esophageal varices and ascites, independent of demographic and imaging factors.
Krueger, D.; Binkley, N.; Madeira, M.; Chen, Z.; Di Gregorio, S.; Del Rio, L.; Humbert, L.
3D-DXA reconstructs DXA hip scans into 3-dimensional images, allowing measurement of trabecular and cortical bone parameters. Given the higher image quality of GE Healthcare iDXA than GE Healthcare Prodigy, it could be hypothesized that the reconstruction might differ, thereby affecting 3D-DXA results. The aim of the study was to assess agreement and precision of 3D-DXA cortical and trabecular femur parameters between Prodigy and iDXA densitometers in adult subjects. The study cohort was composed of 391 men and women recruited from 3 clinical centers (USA and Brazil). All subjects were scanned on either Prodigy or iDXA scanners. Short-term precision was assessed on two Prodigy and two iDXA densitometers. 3D-DXA analyses were performed using 3D-Shaper software version 2.14. Agreement between densitometers was assessed by regression and Bland-Altman analyses. Short-term precision was determined following International Society for Clinical Densitometry recommendations. Strong agreement for 3D-DXA parameters was obtained between devices regardless of the center or the DXA device model (all R² > 0.96). Bland-Altman analyses demonstrated statistically (p < 0.05), but not clinically, significant differences in both aBMD and 3D-DXA measurements obtained using Prodigy and iDXA scanners. Short-term precision of areal BMD and 3D-DXA parameters was similar between densitometers. This study demonstrated excellent 3D-DXA measurement agreement and similar precision between iDXA and Prodigy densitometers. These data provide evidence that no adjustments are required when using 3D-Shaper software on iDXA or Prodigy instruments. Mini Abstract: We assessed agreement and precision of 3D-DXA parameters between GE Healthcare Prodigy and iDXA densitometers in adults. Strong agreement was observed between devices, and short-term precision was comparable. Findings indicate that no adjustment is needed when using 3D-DXA with GE Healthcare densitometers.
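Short-term precision in densitometry is usually summarized as the root-mean-square standard deviation (RMS-SD) across subjects with repeated scans, with the least significant change (LSC) at 95% confidence taken as 2.77 × the precision error. The sketch below shows that arithmetic under these assumptions. It is not the study's analysis code, and the function names and values are hypothetical.

```python
import math
import statistics


def rms_sd(repeat_scans):
    """Root-mean-square of the per-subject standard deviations of repeated scans."""
    variances = [statistics.variance(scans) for scans in repeat_scans]
    return math.sqrt(statistics.fmean(variances))


def least_significant_change(precision_error):
    """LSC at 95% confidence: 2.77 (about 1.96 * sqrt(2)) times the precision error."""
    return 2.77 * precision_error


# Hypothetical duplicate measurements (arbitrary units) for two subjects.
scans = [[10.0, 12.0], [20.0, 20.0]]
precision = rms_sd(scans)
print(precision, least_significant_change(precision))
```

Averaging variances before taking the square root, rather than averaging standard deviations directly, keeps one noisy subject from being under-weighted.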
Fink, A.; Burzer, F.; Sacalean, V.; Rau, S.; Kaestingschaefer, K. F.; Rau, A.; Koettgen, A.; Bamberg, F.; Jaenigen, B.; Russe, M. F.
Background: Kidney volumetry derived from CT has been proposed as a surrogate of renal function in living kidney donor evaluation. However, clinical integration has been limited by reader-dependent workflows and semiautomatic methods susceptible to image quality. Purpose: To evaluate whether fully automated CT-based segmentation of renal cortex, medulla and total parenchymal volume provides reproducible volumetric biomarkers associated with global and split renal function in living kidney donor candidates. Materials and Methods: In this retrospective single-center study, 461 living kidney donor candidates (2003-2021) underwent contrast-enhanced abdominal CT. A convolutional neural network was trained to automatically segment cortical, medullary, and total parenchymal volumes on arterial-phase images. Segmentation performance was evaluated against manual reference annotations. Volumes were indexed to body surface area. Associations with eGFR, 24-hour creatinine clearance, cystatin C, and tubular clearance were assessed using the Spearman correlation coefficient (ρ), and side-specific volume fractions were compared with scintigraphy-derived split function. Results: Automated segmentation achieved excellent agreement with expert reference segmentations (Dice 0.95 for cortex; 0.90 for medulla). eGFR correlated moderately with cortical (ρ = 0.46) and total parenchymal volume (ρ = 0.45), and modestly with medullary volume (ρ = 0.30). Similar associations were observed for other global measures, with the strongest correlation for cortical volume and tubular clearance (ρ = 0.53). Side-specific volume fractions correlated with scintigraphy-derived split renal function (ρ = 0.49-0.56; all p < 0.001). Conclusion: Automated CT-based renal subcompartment segmentation provides reproducible volumetric biomarkers within routine donor evaluation.
Cortical volume performs comparably to total parenchymal volume and tracks split renal function at the cohort level, suggesting potential utility in donor assessment.
Lettner, J. D.; Evrenoglou, T.; Binder, H.; Fichtner-Feigl, S.; Neubauer, C.; Ruess, D. A.
Background: AI-based radiomics has demonstrated promising diagnostic performance for pancreatic cystic neoplasms, yet clinical translation remains limited. Whether this reflects insufficient model performance or structural limitations of the evidence base remains unclear. Methods: We performed a systematic review and diagnostic test accuracy meta-analysis of AI-based radiomics in pancreatic cysts (2015-2025), addressing two clinically relevant tasks (Q1: cyst type differentiation; Q2: malignancy or high-grade dysplasia prediction). Training and validation datasets were synthesized independently using hierarchical models. Study evaluation extended beyond diagnostic performance to a four-dimensional framework integrating RQS 2.0, METRICS, TRIPOD+AI and PROBAST+AI, explicitly contrasting pooled diagnostic performance with reporting quality, methodological rigor, and risk of bias. The review was pre-registered (PROSPERO) and conducted according to PRISMA 2020. Results: Twenty-nine studies were included (Q1: n = 15; Q2: n = 14), predominantly retrospective and single center. Training-based analyses showed high apparent diagnostic performance for Q1 (pooled sensitivity/specificity: 0.89 [95% CI, 0.85-0.92]/0.90 [0.85-0.93]), but there was substantial heterogeneity (τ² = 0.56/0.78; ρ = 0.38). Validation-based performance remained high (0.86 [0.82-0.89]/0.88 [0.81-0.93]), while heterogeneity persisted and prediction regions exceeded confidence regions. Training-based analyses demonstrated similarly high apparent performance (0.88 [0.79-0.95]/0.89 [0.81-0.94]) for Q2, with pronounced heterogeneity (τ² = 1.98/1.61; ρ = 0.63). Validation-based performance was slightly lower, yet still clinically comparable (0.82 [0.75-0.89]/0.86 [0.80-0.91]), and heterogeneity persisted (τ² = 0.71/0.43; ρ = 0.15).
Across both tasks, high diagnostic accuracy occurred alongside incomplete reporting, limited validation and an elevated risk of bias. Conclusion: AI-based radiomics for pancreatic cysts has reached a structural performance plateau. Further improvements in diagnostic accuracy alone are insufficient to achieve clinical translation and must be accompanied by a paradigm shift from performance-driven model development toward decision-anchored study designs, robust validation strategies, transparent reporting standards, and clinically integrated evaluation frameworks. Summary: Although pancreatic cystic lesions are increasingly being detected, imaging-based decision-making remains limited, particularly regarding differentiating between cyst types and stratifying malignancy risk. In this PRISMA-compliant and PROSPERO-registered systematic review and meta-analysis of diagnostic tests, we evaluated the use of AI-based radiomics for these two tasks, as well as its contextualized performance. In addition, a four-dimensional framework was employed to conduct the evaluation, incorporating diagnostic accuracy, reporting quality, risk of bias, and radiomics maturity. Across studies published between 2015 and 2025, the pooled diagnostic performance was consistently high, with only modest declines observed from the training to the validation stage. Nevertheless, considerable heterogeneity between studies and limited transportability remained evident. Multidimensional evaluation indicated a systematic dissociation between reported performance and methodological robustness, characterized by incomplete reporting, restricted validation, and an elevated risk of bias. These limitations were consistent across both clinical questions and were not resolved by increasing model complexity. The findings of this meta-analysis suggest that the structural performance of AI-based radiomics for pancreatic cysts has plateaued.
To progress towards clinical translation, it is necessary to employ study designs anchored in decision-making processes, robust multi-center validation, and transparent, reproducible evaluation frameworks. This is preferred to further optimization of model architecture alone.
Whitcher, B.; Raza, H.; Basty, N.; Thanaj, M.; Bell-Bradford, C.; Niglas, M.; Bell, J. D.; Thomas, E. L.; Amiras, D.
Quantifying muscle health at scale has been limited by the difficulty of segmenting individual muscles on MRI. We developed an automated 3D deep-learning framework that segments 20 bilateral hip and thigh muscles from Dixon MRI, enabling muscle level quantification of volume and relative fat fraction (rFF). Applied to 10,840 baseline and 2,766 longitudinal UK Biobank scans, this framework supports population-scale phenotyping across demographic, metabolic and treatment exposures. Segmentation accuracy was robust, and increased with muscle size. Men had greater muscle volumes, whereas women showed consistently higher rFF. Fat infiltration was highest in postural and pelvic-stabilising muscles and lowest in the quadriceps, revealing pronounced anatomical heterogeneity. Over two years, most muscles showed small but consistent volume declines, with losses more uniform in men and more heterogeneous in women; rFF increased more prominently in women, suggesting early compositional deterioration. In T2D, men showed widespread volume loss and elevated rFF, whereas women showed minimal volume loss and heterogeneous fat changes, revealing sex-specific disease signatures. Automated muscle-specific MRI phenotyping resolves structural and compositional changes obscured by compartment-level measures and provides a scalable platform for population-level studies of musculoskeletal ageing, metabolic disease, and therapeutic response.
Wu, J.; Perandini, L.; Batra, T.; Igoshin, S.; Bari, S.; de Araujo, A. L.; Willemink, M. J.
Digital breast tomosynthesis (DBT) is a powerful imaging modality that allows for improved lesion visibility, characterization, and localization compared to conventional two-dimensional digital mammography. DBT has been increasingly adopted in screening and diagnostic settings globally, particularly for women with dense breast tissue where tissue overlap presents a significant diagnostic challenge. Here we describe DBT-2026, a real world imaging dataset with 558 DBT exams from 558 patients with breast imaging reporting and data system (BI-RADS) scores of 0, 1, or 2. Each case contains one DBT examination in combination with expert annotations and free-text radiology reports that describe the radiological findings, produced in routine clinical practice. To protect patient privacy, all images and reports have been de-identified. The dataset is made freely available to researchers for non-commercial projects to facilitate and encourage research in breast cancer imaging.
Castelo, A.; O'Connor, C.; Gupta, A. C.; Anderson, B. M.; Woodland, M.; Altaie, M.; Koay, E. J.; Odisio, B. C.; Tang, T. T.; Brock, K. K.
Artificial intelligence (AI) based segmentation has many medical applications, but limited curated datasets challenge model training; this study compares the impact of dataset annotation quality and quantity on whole-liver AI segmentation performance. We obtained 3,089 abdominal computed tomography scans with whole-liver contours from MD Anderson Cancer Center (MDA) and a MICCAI challenge. A total of 249 scans were withheld for testing, of which 30, from the MICCAI challenge, were reserved for external validation. The remaining scans were divided into mixed-curation and highly curated groups, randomly sampled into sub-datasets of various sizes, and used to train 3D nnU-Net segmentation models. Dice similarity coefficients (DSC), surface DSC with 2 mm margins (SD 2mm), the 95th percentile of Hausdorff distance (HD95), and 2D axial slice DSC (Slice DSC) were used to evaluate model performance. The highly curated, 244-scan model (DSC=0.971, SD 2mm=0.958, HD95=2.98mm) did not differ significantly on 3D evaluation metrics from the mixed-curation 2,840-scan model (DSC=0.971 [p>.999], SD 2mm=0.958 [p>.999], HD95=2.87mm [p>.999]). The 710-scan mixed-curation model (Slice DSC=0.929) significantly outperformed the highly curated, 244-scan model (Slice DSC=0.923 [p=0.012]) on the 30 external scans. Highly curated datasets yielded equivalent performance to datasets that were a full order of magnitude larger. The benefits of larger, mixed-curation datasets are evidenced in model generalizability metrics and local improvements. In conclusion, tradeoffs between dataset quality and quantity for model training are nuanced and goal dependent.
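To illustrate why a slice-wise Dice score can separate models that look identical on the volumetric score, the sketch below computes a 3D Dice coefficient and a mean per-slice Dice from voxel coordinate sets. The abstract does not define Slice DSC exactly, so averaging over every axial slice that contains a voxel from either mask is an assumption, as are the function names and the toy masks.

```python
from collections import defaultdict


def dice(a, b):
    """Dice similarity of two voxel (or pixel) coordinate sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_slice_dice(a, b):
    """Mean 2D Dice over axial slices (z index) touched by either mask."""
    slices_a, slices_b = defaultdict(set), defaultdict(set)
    for x, y, z in a:
        slices_a[z].add((x, y))
    for x, y, z in b:
        slices_b[z].add((x, y))
    z_values = set(slices_a) | set(slices_b)
    if not z_values:
        return 1.0
    return sum(dice(slices_a[z], slices_b[z]) for z in z_values) / len(z_values)


# Hypothetical masks: identical on slice 0, each with one stray voxel elsewhere.
truth = {(0, 0, 0), (1, 0, 0), (0, 0, 1)}
pred = {(0, 0, 0), (1, 0, 0), (0, 0, 2)}

print(round(dice(truth, pred), 3))
print(round(mean_slice_dice(truth, pred), 3))
```

In this toy case the volumetric Dice is about 0.667 while the mean slice Dice is about 0.333, because errors confined to sparse slices count as much as errors on large ones. That is one way local differences can surface in a slice-level metric but not in a 3D one.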
Bhutto, D. F.; Kim, E.; Pajankar, N.; Vahedifard, F.; Daneshzand, M.; Edwards, D.; Nummenmaa, A.
Background: Motor threshold (MT) estimation is fundamental to transcranial magnetic stimulation (TMS), guiding individualized stimulation intensity in research and therapy. Conventional methods such as the 5-out-of-10 rule require many stimuli, while adaptive approaches like Parameter Estimation by Sequential Testing (PEST) improve efficiency but can exhibit poor convergence under certain conditions. Objective: This study introduces the Bayesian Uncertainty Dynamic Algorithm for Parameter Estimation by Sequential Testing (BUDAPEST), a Bayesian adaptive method for fast, accurate MT estimation with user-controlled uncertainty. The aims were to validate its accuracy in simulations and human data, promote usability through a MATLAB-based graphical interface, and evaluate experimental utility through resting and active MT comparisons and session-to-session reliability. Methods: BUDAPEST infers MT from binary MEP responses using sequential Bayesian updating and terminates when a user-defined uncertainty threshold is reached. Performance was evaluated in 10,000 virtual simulations and in human rMT and aMT measurements across two sessions per subject, including 3x5 cortical motor mapping to assess physiological spatial patterns. Results: In simulations, BUDAPEST achieved a mean absolute error of 1.9% MSO within ~10 pulses using a 2% uncertainty criterion while avoiding PEST misestimations. In human data, MT estimates were accurate within ±4% MSO and robust to initialization; rMT showed strong session-to-session reliability (r = 0.78), whereas aMT exhibited greater variability. Motor mapping revealed coherent excitability gradients centered on the hotspot. Conclusion: BUDAPEST enables rapid, reliable, and uncertainty-controlled MT estimation while reducing procedure time and participant burden. The accompanying GUI facilitates immediate adoption in research and clinical TMS environments.
Highlights
- Introduces BUDAPEST, a Bayesian uncertainty-aware algorithm for rapid and reliable TMS motor threshold estimation.
- Achieves accurate MT estimates (≈2% MSO error) in ~10 pulses with user-controlled trade-offs between precision and procedure duration.
- Demonstrates robust performance in simulations and human data, with strong resting MT reliability and an open-source GUI enabling immediate adoption.
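The idea of sequential Bayesian updating with an uncertainty-based stopping rule can be illustrated with a small grid-based sketch. This is not the authors' BUDAPEST implementation: the cumulative-Gaussian response model, its relative spread of 0.07, the flat prior over 20 to 80% MSO, the choice to stimulate at the posterior mean, and the function names `p_response` and `estimate_threshold` are all assumptions made for illustration.

```python
import math
import random


def p_response(intensity, threshold, spread=0.07):
    """Assumed response model: probability of an MEP as a cumulative Gaussian
    centred on the threshold (spread is relative to the threshold)."""
    z = (intensity - threshold) / (spread * threshold)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_stats(grid, post):
    """Posterior mean and standard deviation over the threshold grid."""
    mean = sum(t * p for t, p in zip(grid, post))
    var = sum(p * (t - mean) ** 2 for t, p in zip(grid, post))
    return mean, math.sqrt(var)


def estimate_threshold(respond, sd_stop=2.0, max_pulses=100):
    """Update a grid posterior after each binary response and stop once the
    posterior SD falls to sd_stop (% MSO) or max_pulses is reached."""
    grid = [20.0 + 0.5 * i for i in range(121)]  # candidate thresholds, 20..80 % MSO
    post = [1.0 / len(grid)] * len(grid)         # flat prior (assumption)
    pulses = 0
    mean, sd = posterior_stats(grid, post)
    while sd > sd_stop and pulses < max_pulses:
        fired = respond(mean)                    # stimulate at the posterior mean
        like = [p_response(mean, t) if fired else 1.0 - p_response(mean, t)
                for t in grid]
        post = [l * p for l, p in zip(like, post)]
        total = sum(post)
        post = [p / total for p in post]         # renormalise (Bayes' rule)
        pulses += 1
        mean, sd = posterior_stats(grid, post)
    return mean, sd, pulses


# Simulated subject with a hypothetical true threshold of 47% MSO.
rng = random.Random(0)
estimate, uncertainty, used = estimate_threshold(
    lambda x: rng.random() < p_response(x, 47.0))
print(f"estimate={estimate:.1f}% MSO, sd={uncertainty:.2f}, pulses={used}")
```

Lowering `sd_stop` trades more pulses for a tighter posterior, which is the precision-versus-duration trade-off the highlights describe.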
Wu, Z.; Mazzola, C. A.; Goodman, A.; Gao, Y.; Alvarez, T.; Li, X.
Traumatic brain injury (TBI), particularly sports- and recreational-activity-related mild TBI (mTBI), is common in young adults and can be followed by persistent attentional and executive complaints. This study investigated chronic (≥6 months post-injury) structural brain alterations in gray matter (GM) and white matter (WM) and their associations with self-reported inattentive and hyperactive/impulsive symptoms, with a focus on sex-differentiated patterns. Structural GM and WM properties were acquired from 44 subjects with TBI and 45 matched controls using structural MRI and diffusion tensor imaging. Behavioral measures assessing the severity of post-TBI inattentive and hyperactive/impulsive symptoms were collected from each participant. Between-group and sex-specific differences in these brain and behavioral measures were tested. Interactions among the significant TBI-related brain and behavioral alterations, and their sex-specific patterns, were assessed as well. Compared with controls, the TBI group showed a male-dominated pattern of increased cortical thickness in the superior parietal lobule (SPL) and a female-dominated pattern of higher fractional anisotropy (FA) in the superior longitudinal fasciculus and superior fronto-occipital fasciculus (sFOF). In males with TBI, greater SPL cortical thickness was significantly correlated with increased inattentive behaviors. In females with TBI, higher FA of the sFOF was significantly correlated with decreased hyperactive/impulsive behaviors. The findings suggest that TBI-induced GM abnormalities in the superior parietal cortex may contribute significantly to attention deficits in patients with TBI, especially males, while optimal post-TBI WM recovery in the sFOF contributes significantly to the maintenance of inhibitory control, especially in females.
Raghu, N.; Abbasi, M.; Tashi, Z.; Zamora, C.; Key, S.; Chong, C. D.; Zhou, Y.; Niklova, S.; Ofori, E.; Bartelle, B. B.
Magnetic Resonance Spectroscopy Imaging (MRSI) offers spatially resolved neurometabolic information, acquired non-invasively at whole-brain scales from human subjects. Analysis of MRSI, however, is extremely challenging. The metabolic information is highly convolved and sparsely distributed across millions of spatial-spectral datapoints, allowing for little direct human interpretation. Conversely, the overall low signal-to-noise ratio and high-intensity artifacts can confound unsupervised machine learning approaches. These technical barriers have left much of the potential of MRSI unrealized. We acquired MRSI data from 4 human subjects with a diagnosis of multiple sclerosis (MS), incorporating experimental design into an informed machine learning approach. MRSI acquisitions were registered to anatomical MRI to label 105k spectra from brain tissue and 162 spectra from white matter hyperintensities (WMHs), an imaging biomarker associated with MS lesions. Spectral labels were then used in contrastive principal component analysis (cPCA) to separate lesion-salient features from artifacts and background features in the MRSI data; the resulting spectra were clustered into statistically significant states based on features that could be interpreted from the original data. Our approach renders MRSI data into testable representations of neurometabolism, enabling the method for fundamental and clinical research. Graphical Abstract: Analysis workflow for neurometabolic profiling of MS lesions. MRSI and anatomical MRI are acquired and processed in parallel for spectral data and anatomical labels. Spectra are then labeled and separated into experimental vs background data for contrastive PCA. Spectra are clustered for similarity, further labeled, and projected onto a brain atlas for a neurometabolic view.
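As a rough illustration of the contrastive PCA step named in this abstract, the sketch below finds directions that have high variance in a target set but low variance in a background set, by eigendecomposing the difference of the two covariance matrices. It is a generic textbook form of cPCA on synthetic data, not the authors' pipeline; the function name `contrastive_pca`, the contrast weight `alpha`, and the toy data are assumptions.

```python
import numpy as np


def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Return the top eigenvectors of cov(target) - alpha * cov(background).

    Rows are samples (e.g. spectra), columns are features."""
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)
    cov_t = t.T @ t / (len(t) - 1)
    cov_b = b.T @ b / (len(b) - 1)
    # eigh handles the symmetric (possibly indefinite) contrast matrix.
    evals, evecs = np.linalg.eigh(cov_t - alpha * cov_b)
    order = np.argsort(evals)[::-1][:n_components]  # largest contrast first
    return evecs[:, order], evals[order]


# Synthetic example: feature 0 is a strong shared nuisance signal,
# feature 1 carries extra variance only in the target set.
rng = np.random.default_rng(0)
background = rng.normal(size=(2000, 3)) * np.array([5.0, 1.0, 1.0])
target = rng.normal(size=(2000, 3)) * np.array([5.0, 4.0, 1.0])

components, contrast = contrastive_pca(target, background)
projected = (target - target.mean(axis=0)) @ components
print(np.round(components[:, 0], 2))
```

Ordinary PCA on the target data would be dominated by feature 0; subtracting the background covariance suppresses that shared variance so the leading direction aligns with the target-specific feature. The projected data could then be passed to a clustering step.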
Loeffen, D. W. M.; Rijpma, A.; Bartels, R. H. M. A.; Vinke, R. S.
Deep-learning-based super-resolution has shown promise for enhancing the spatial resolution of brain magnetic resonance images, which may help visualize small anatomical structures more clearly. However, when only limited training data are available, it remains uncertain which model assessment method provides the most reliable estimate of out-of-sample performance. In this study, three widely used assessment strategies (three-way holdout, k-fold cross-validation, and nested cross-validation) were compared for evaluating the performance of such models in small datasets. Across 30 iterations, we randomly selected subsets of 20 T2-weighted images from the 1,113 scans of the Human Connectome Project. Each subset was used to train a model and estimate performance using the three methods. The ground truth error was computed from the remaining images. The assessment error is the difference between the estimated error and the ground truth error. The median assessment errors were 0.11, -0.13, and -0.32 for three-way holdout, k-fold cross-validation, and nested cross-validation, respectively, with the cross-validation methods showing considerably smaller dispersions. Nested cross-validation selected fewer epochs, indicating more conservative model selection, but required substantially greater computational time, over three times longer than three-way holdout and more than twenty times longer than k-fold cross-validation. Our findings suggest that k-fold cross-validation offers the most favourable balance between accuracy, stability, and computational feasibility in small datasets. Further research is needed to determine how model complexity, dataset size, and the number of cross-validation folds influence assessment accuracy.
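The assessment error defined in this abstract (estimated error minus ground truth error) can be made concrete with a toy example. The sketch below replaces the super-resolution network with a one-variable least-squares line and uses synthetic data, so it only mirrors the evaluation logic: draw 20 samples from a pool of 1,113, estimate the error by k-fold cross-validation, and compare with the error on the remaining samples. The helper names, the choice of k = 5, and the data are assumptions, and epoch selection is not modelled.

```python
import random


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(model, points):
    """Mean squared error of a fitted line on a set of points."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)


def kfold_estimate(points, k=5):
    """Average held-out MSE over k folds."""
    folds = [points[i::k] for i in range(k)]
    errors = []
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(mse(fit_line(train), held_out))
    return sum(errors) / k


# Synthetic pool standing in for the 1,113 scans.
rng = random.Random(0)
pool = []
for _ in range(1113):
    x = rng.uniform(0.0, 10.0)
    pool.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))

rng.shuffle(pool)
subset, rest = pool[:20], pool[20:]          # small training set vs. remaining data

estimated = kfold_estimate(subset)           # what the assessment method reports
truth = mse(fit_line(subset), rest)          # "ground truth" error on unseen data
assessment_error = estimated - truth
print(f"estimated={estimated:.3f} truth={truth:.3f} error={assessment_error:+.3f}")
```

Repeating the shuffle-and-evaluate step many times and taking the median and spread of `assessment_error` would correspond to the 30-iteration comparison described above.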
Xie, C.; Wang, Y.; Li, D.; Yu, B.; Peng, S.; Wu, L.; Yang, M.
Handheld ultrasound devices have revolutionized point-of-care diagnostics, but their effectiveness remains limited by operator dependency and the need for specialized training. This paper presents an intelligent guidance and diagnostic assistance system for a handheld wireless ultrasound device, enabling automated carotid artery and thyroid examinations through handheld operation. Drawing inspiration from the Actor-Critic framework, we implement a simulation-based reinforcement learning approach for real-time probe navigation toward standard anatomical views. The system integrates YOLOv8n-based detection networks for carotid plaque and thyroid nodule identification, achieving real-time inference at 30 frames per second. Furthermore, we propose a hybrid measurement approach combining UNet segmentation with the Snake algorithm for precise biometric quantification, including carotid intima-media thickness (IMT), lumen diameter, and lesion dimensions. Experimental validation on clinical datasets demonstrates that the proposed system achieves 91.2% accuracy in standard plane acquisition, 87.5% mean average precision (mAP) for plaque detection, and 89.3% mAP for nodule identification. Measurement results show excellent agreement with expert sonographers, with IMT measurements exhibiting a mean absolute difference of 0.08 mm. These findings demonstrate the feasibility of intelligent handheld ultrasound examination, significantly reducing operator dependency while maintaining diagnostic accuracy comparable to that of experienced clinicians.
Sahin, S.; Diaz, E.; Rajagopal, A.; Abtahi, M.; Jones, S.; Dai, Q.; Kramer, S.; Wang, Z.; Larson, P. E. Z.
Current standard of care imaging practices cannot reliably differentiate among certain renal tumors such as benign oncocytoma and clear cell renal cell carcinoma (RCC), and between low and high grade RCCs. Previous work has explored using deep learning, radiomics, and texture analysis to predict renal tumor subtypes and differentiate between low and high grade RCCs with mixed success. To further this work, large diverse datasets are needed to improve model performance and provide strong evaluation sets. In this work, a dataset of 831 multi-phase 3D CT exams was curated. Each exam contains up to three contrast-enhanced CT phases. Tumor outlines or bounding boxes were annotated and registered to the image volumes. The pathology results for each tumor and relevant patient metadata are also included.